NSF PAR Search | NSF Public Access Repository

Individual Showcase: What do high school students experience and learn during a two-day datathon?

Jessen_Eller, Kathryn; Schneider, Jillian; Fraser, Hamish; Warner, Jeremy; Uzun, Ece; Jain, Sandeep; De_Matos, Joao_Carlos_R_G; Rademacher, Doug; Bopardikar, Anushree; Cassidy, Michael; et al (February 2025, Data Science for Everyone)

What do high school students learn from a two-day datathon during which they tackle data to visualize the impact of biased data on healthcare decisions? How do they interact with their team of high school students, data scientists, clinicians, and teachers? What did we, the developers and leaders of the datathon, learn? How would we approach it differently next year? Our goal is to answer these questions and to share lessons learned. We will then divide the audience into teams to brainstorm ways to approach and solve some of the problems we experienced, and we hope to recruit some audience members to participate in our June 2025 Brown University Health Artificial Intelligence (AI) Systems Thinking for Equity (HASTE) Datathon in Providence, Rhode Island (Brown University Datathon, 2024).
Free, publicly-accessible full text available February 17, 2026
Collaborative large language models for automated data extraction in living systematic reviews

https://doi.org/10.1093/jamia/ocae325

Khan, Muhammad Ali; Ayub, Umair; Naqvi, Syed_Arsalan Ahmed; Khakwani, Kaneez_Zahra Rubab; Sipra, Zaryab_bin Riaz; Raina, Ammad; Zhou, Sihan; He, Huan; Saeidi, Amir; Hasan, Bashar; et al (January 2025, Journal of the American Medical Informatics Association)

Abstract
Objective: Data extraction from the published literature is the most laborious step in conducting living systematic reviews (LSRs). We aim to build a generalizable, automated data extraction workflow leveraging large language models (LLMs) that mimics the real-world 2-reviewer process.
Materials and Methods: A dataset of 10 trials (22 publications) from a published LSR was used, focusing on 23 variables related to trial, population, and outcomes data. The dataset was split into prompt development (n = 5) and held-out test sets (n = 17). GPT-4-turbo and Claude-3-Opus were used for data extraction. Responses from the 2 LLMs were considered concordant if they were the same for a given variable. The discordant responses from each LLM were provided to the other LLM for cross-critique. Accuracy, ie, the total number of correct responses divided by the total number of responses, was computed to assess performance.
Results: In the prompt development set, 110 (96%) responses were concordant, achieving an accuracy of 0.99 against the gold standard. In the test set, 342 (87%) responses were concordant. The accuracy of the concordant responses was 0.94. The accuracy of the discordant responses was 0.41 for GPT-4-turbo and 0.50 for Claude-3-Opus. Of the 49 discordant responses, 25 (51%) became concordant after cross-critique, increasing accuracy to 0.76.
Discussion: Concordant responses by the LLMs are likely to be accurate. In instances of discordant responses, cross-critique can further increase the accuracy.
Conclusion: Large language models, when simulated in a collaborative, 2-reviewer workflow, can extract data with reasonable performance, enabling truly “living” systematic reviews.
Free, publicly-accessible full text available January 21, 2026
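
The concordance check and accuracy measure described in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the `(publication, variable)` key shape, the toy responses, and the use of exact string equality as the test for "the same" are all assumptions made for illustration. The final lines only redo the arithmetic on counts quoted in the abstract.

```python
from typing import Dict, List, Tuple

# Hypothetical key shape: (publication id, variable name).
Key = Tuple[str, str]


def split_by_concordance(
    a: Dict[Key, str], b: Dict[Key, str]
) -> Tuple[List[Key], List[Key]]:
    """Partition shared keys into concordant and discordant lists.

    Assumption: two responses are "the same" when the strings are equal.
    """
    concordant: List[Key] = []
    discordant: List[Key] = []
    for key in sorted(a.keys() & b.keys()):
        (concordant if a[key] == b[key] else discordant).append(key)
    return concordant, discordant


def accuracy(responses: Dict[Key, str], gold: Dict[Key, str], keys: List[Key]) -> float:
    """Correct responses divided by total responses, over the given keys."""
    if not keys:
        raise ValueError("accuracy is undefined for an empty set of responses")
    correct = sum(1 for k in keys if responses[k] == gold[k])
    return correct / len(keys)


# Toy data (invented, not taken from the study).
model_a = {("pub1", "sample_size"): "120", ("pub1", "median_age"): "64", ("pub2", "sample_size"): "98"}
model_b = {("pub1", "sample_size"): "120", ("pub1", "median_age"): "46", ("pub2", "sample_size"): "98"}
gold = {("pub1", "sample_size"): "120", ("pub1", "median_age"): "64", ("pub2", "sample_size"): "98"}

concordant, discordant = split_by_concordance(model_a, model_b)
acc_concordant = accuracy(model_a, gold, concordant)
acc_a_discordant = accuracy(model_a, gold, discordant)
acc_b_discordant = accuracy(model_b, gold, discordant)

# Arithmetic on the test-set counts quoted in the abstract.
n_concordant, n_discordant, n_resolved = 342, 49, 25
n_total = n_concordant + n_discordant          # 391 responses in total
concordance_rate = n_concordant / n_total      # share of concordant responses
resolved_rate = n_resolved / n_discordant      # share resolved by cross-critique
```

The total of 391 test-set responses appears consistent with 17 publications times 23 variables, and the two rates round to the 87% and 51% reported in the abstract. The discordant keys would be the ones passed to the other model for cross-critique, a step this sketch does not attempt to model.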
Utilization of COVID-19 Treatments and Clinical Outcomes among Patients with Cancer: A COVID-19 and Cancer Consortium (CCC19) Cohort Study

https://doi.org/10.1158/2159-8290.CD-20-0941

Rivera, Donna R.; Peters, Solange; Panagiotou, Orestis A.; Shah, Dimpy P.; Kuderer, Nicole M.; Hsu, Chih-Yuan; Rubinstein, Samuel M.; Lee, Brendan J.; Choueiri, Toni K.; de Lima Lopes, Gilberto; et al (October 2020, Cancer Discovery)
Full Text Available